European Clinical Case Corpus

Abstract

Interpreting information in medical documents has become one of the most relevant application areas for language technologies. However, despite the fact that huge amounts of such documents (e.g., examination reports, hospital discharge letters, digital records) are produced, their availability for research purposes is still limited, due to strict data protection regulations. Aiming at fostering advanced extraction technologies and applications, we present E3C, a corpus of clinical case narratives fully based on freely licensed documents. E3C (European Clinical Case Corpus) contains a vast selection of cases (i.e., narratives presenting a patient's history) that cover different areas and styles and are produced in different languages. A portion has been manually annotated to be used for training and testing purposes, while a larger set has been automatically tagged to serve as a baseline for future extraction work.

Similar resources

The Sentence-Aligned European Patent Corpus

This paper describes the creation and the content of the Sentence-Aligned European Patent Corpus. The corpus contains more than 130 million sentence pairs for 6 European languages. With more than 76 million sentence pairs, to our knowledge, the EN-DE sub corpus is the largest bilingual sentence-aligned corpus. For other language pairs, work has started to obtain sub corpora of similar size. The...

DCEP - Digital Corpus of the European Parliament

We are presenting a new highly multilingual document-aligned parallel corpus called DCEP (Digital Corpus of the European Parliament). It consists of various document types covering a wide range of subject domains. With a total of 1.37 billion words in 23 languages (253 language pairs), gathered in the course of ten years, this is the largest single release of documents by a European Union institu...

Corpus Stylistics and Translation: Toni Morrison's Beloved as a Case Study

Stylistics, as a method for understanding and interpreting a literary text, analyzes texts through a variety of approaches, which has given rise to different branches of stylistics, including formalist stylistics, cognitive stylistics, corpus-based stylistics, and others. As a first step, this study seeks to show the importance and usefulness of corpus-based stylistics in analyzing and examining the stylistic features of a literary work, and for this purpose...

Priberam Compressive Summarization Corpus: A New Multi-Document Summarization Corpus for European Portuguese

In this paper, we introduce the Priberam Compressive Summarization Corpus, a new multi-document summarization corpus for European Portuguese. The corpus follows the format of the summarization corpora for English in recent DUC and TAC conferences. It contains 80 manually chosen topics referring to events that occurred between 2010 and 2013. Each topic contains 10 news stories from major Portuguese n...

Tagging a Corpus of Interpreted Speeches: the European Parliament Interpreting Corpus (EPIC)

The performance of three different taggers (Treetagger, Freeling and GRAMPAL) is evaluated on three different languages, i.e. English, Italian and Spanish. The materials are transcripts from the European Parliament Interpreting Corpus (EPIC), a corpus of original (source) and simultaneously interpreted (target) speeches. Owing to the oral nature of our materials and to the specific characterist...

Journal

Journal title: Cognitive Technologies

Year: 2022

ISSN: 2197-6635, 1611-2482

DOI: https://doi.org/10.1007/978-3-031-17258-8_17